Transfer learning based generalized framework for state of health estimation of Li-ion cells

Estimating the state of health (SOH) of batteries powering electronic devices in real-time while in use is a necessity. The applicability of most of the existing methods is limited to the datasets that are used to train the models. In this work, we propose a generic method for SOH estimation with much wider applicability. The key problem is the identification of the right feature set which is derived from measurable voltage signals. In this work, relative rise in voltage drop across cell resistance with aging has been used as the feature. A base artificial neural network (ANN) model has been used to map the generic relation between voltage and SOH. The base ANN model has been trained using limited battery data. Blind testing has been done on long cycle in-house data and publicly available datasets. In-house data included both laboratory and on-device data generated using various charge profiles. Transfer learning has been used for public datasets as those batteries have different physical dimensions and cell chemistry. The mean absolute error in SOH estimation is well within 2% for all test cases. The model is robust across scenarios such as cell variability, charge profile difference, and limited variation in temperature.

Affiliation: Samsung R&D Institute India-Bangalore, Bangalore 560037, India.

An increasing number of electronic devices, such as electric vehicles and mobile phones, rely on rechargeable batteries as their only source of power. As a battery is repeatedly charged and discharged (known as cycling), its available capacity decreases continuously, thereby degrading the battery. Hence, every battery traverses a health trajectory from a perfectly healthy state to a completely dead state. Accurate, on-device estimation of battery state of health (SOH) is essential to monitor the battery condition. The algorithm should be robust enough to remain compatible with rapidly evolving battery specifications. The SOH of a battery is defined as the relative change in capacity over the charge-discharge cycles. For example, the SOH of a battery at the $n$th cycle can be defined as

$$\mathrm{SOH}_n = \frac{C_n}{C_{\mathrm{rated}}} \tag{1}$$

where $C_{\mathrm{rated}}$ is the rated (nominal) capacity of the battery, and $C_n$ is the capacity of the battery after $n$ cycles. SOH has to be estimated from the measured voltage and current signals.
State-of-the-art SOH estimation algorithms can be broadly categorized into model-based methods and data-driven methods 1. Model-based techniques use estimation algorithms such as the Kalman filter 2,3, the particle filter 4,5, etc. to estimate SOH from an equivalent circuit model of the battery. Data-driven methods use regular charge or discharge data and the corresponding SOH to train machine learning models 1,6. Over time, data-driven methods have gained popularity due to their ease of use and the advancement in the computational capability of machines. Further, the potential of data-driven methods can be enhanced as more data becomes available. This ensures improved accuracy with minimal intervention in the underlying computation.
Another set of SOH estimation methods relies on the shift in peaks from an incremental capacity (IC) analysis. A shift in peak height in the IC curve (dQ/dV vs V plot) can be observed with a decrease in SOH. This information has been exploited in 7-10 for SOH estimation. These methods have been trained and tested using data of the same battery. Other IC curve features such as peak height, peak voltage, and peak area have been used in 11 and 12 for SOH estimation. Though the authors have used exclusive test battery data for model evaluation, the effect of charge profile variation on the algorithm performance has not been explored. The long short-term memory (LSTM) based SOH prediction algorithm in 1 and the LSTM+ANN based algorithm proposed in 13 also use initial cycle data [...] 14 for the estimation of capacity degradation. In 15, the shift in the charging voltage curve due to aging has been used as a feature to train support vector machine (SVM) models. Unlike most of the state-of-the-art methods, the relation between battery surface differential temperature and SOH has been used to train a support vector regression model in 16.
All these methods have been tested on limited cell data, and their robustness to changes in battery type and charging profile has not been investigated. Constant current (CC) charge time and time to charge between two voltage levels reduce with aging. In 17 , these charge-time-based parameters have been used as input to the least square SVM for SOH estimation. However, any variation in charging current profile would change the elapsed time features thereby deviating the output of SVM.
Incremental voltage difference has been used as a feature in 18 to train a shallow ANN. However, as simulated data has been used for training and the slope of the voltage curve has been used as a feature, the method might fail if the charge profile or device specifications change. The energy of equal discharge voltage difference has been used in 19 as a health indicator (HI). Features extracted from the smoothed HI curve have been used in an ANN to estimate SOH. Transfer learning with convolutional neural networks (CNN) has been used in 20. The CNN trained using accelerated aging data of cells has been fine-tuned on 15% of the data of unknown cells, and the remaining 85% has been used to test the algorithm. Though the methods proposed in 19 and 20 seem promising, their efficacy on data with charge profiles other than CCCV, such as multi-step CCCV (MSCCCV), has not been explored. Transfer learning with neural network models has been used in 13,20-22 for SOH estimation. However, these methods use transfer learning to fine-tune the offline-trained models on the data of the target batteries. Although this eliminates training from scratch, it still requires parameter tuning for every new cell.
Existing SOH estimation algorithms are effective in estimating SOH when the variation across battery type (battery specifications such as capacity, internal resistance, etc.) and charge profile (CCCV or MSCCCV charging) is assumed to be the same in training and testing. To the best of our knowledge, none of the methods address the challenges associated with variation in charge profile and battery specification. The existing SOH estimation methods lack extensive testing on different batteries. Most of those methods would require model tuning to estimate the SOH of an unseen cell.
To overcome these issues, the feature has to be robust. It has to be unaffected by variations in battery behavior and charge profiles. It would not require re-training or fine-tuning of SOH estimation models for every new battery. From the analysis of several Li-ion cell data, it has been observed that the voltage drop across cell resistance increases proportionately with degradation in capacity. The absolute values of voltage drop and SOH may differ for similar batteries; however, the relative change largely remains the same. Therefore, the relative increase in voltage drop with aging has been used as a feature to estimate the relative drop in battery SOH. The relation between feature and SOH is independent of C-rate and charge profile. Following are the contributions made in this paper.

1. A novel feature has been proposed which is independent of operating conditions such as charging current profile (CCCV charging, MSCCCV charging, etc.), C-rate, and limited variation in temperature.
2. A relative quantity, obtained directly from measured quantities, has been used as the feature to estimate the relative battery degradation, instead of absolute values (which might change with battery variability). The relative increase in voltage drop over cycles has been mapped to the relative decrease in SOH.
3. The model is trained or fine-tuned using the data of only one cell of a particular chemistry and physical dimension. After that, it can be used for other cells irrespective of charge profile, C-rate, and limited variation in temperature.
4. Extensive testing of the proposed method has been done using both offline and online battery data having different charge profiles, capacities, and operating temperatures. The SOH estimation error remains low in all test scenarios.

Results
Feature selection. The robustness and accuracy of SOH estimation largely depend on the effectiveness of the extracted features. A thorough study of the literature indicates that existing SOH features lack robustness to changes in battery operating conditions such as current rate, charging current profile, and temperature. For example, the time taken to charge a battery between two voltage levels 13 will change if the C-rate or charge profile changes. The slope of the voltage curve used in 18 varies with SOH in the CC region. The slope variation will be negligible in the CV region, so the feature would be ineffective if the charge profile changes to MSCCCV or to a CCCV profile with a small CC region. A robust feature to estimate SOH from the measured voltage and current is therefore needed. In order to select such a feature, laboratory cycled data was analyzed. A resistance model (Fig. 1) was assumed as the electrical equivalent circuit of the battery 23. In the figure, R is the resistance, and OCV is the open-circuit voltage of the battery. V and I are the charging voltage and current respectively. It was observed from the data that the voltage drop across the resistance increased with aging. The quantity reflecting the incremental voltage drop was computed by subtracting the drop across the fresh cell resistance from the measured voltage. An average fresh cell resistance of 60 mΩ was assumed, as the measured resistances of fresh cells were in the range of 50-70 mΩ. The fresh cell resistance was computed using the voltage jump at the beginning of charge, when the step charging current was applied.

Let $V_r$ be the quantity reflecting the increase in voltage drop across the battery resistance. $V_r$ is computed by subtracting the voltage drop across the fresh battery resistance (cycle 0) from the measured terminal voltage. In other words, $V_r$ represents the combined voltage of the cell OCV and the incremental voltage drop across the resistance:

$$V_r = V - I R_0 \tag{2}$$

where $R_0$ is the resistance of the fresh battery, $V$ is the measured voltage, and $I$ is the charging current. SOC(t) (%) at an instant $t$ after the beginning of charge was computed using the current and the sampling interval $dt$:

$$\mathrm{SOC}(t) = \frac{100}{C_{\mathrm{rated}}} \sum_{\tau=0}^{t} I(\tau)\, dt \tag{3}$$

where $t = 0$ refers to the beginning of charge in a cycle, and $C_{\mathrm{rated}}$ is the rated capacity of the battery. $V_r$ at a fixed SOC was found to increase with battery aging. Figure 2 shows feature plots for three different charging protocols: (a) 1C-CCCV, (e) 1.3C-CCCV, and (i) 1C-MSCCCV. $V_r$ vs SOC at different SOH values for these three charging protocols has been plotted in Fig. 2b, f, and j respectively. A clear shift in the $V_r$ vs SOC curves can be observed with a decrease in SOH in all three cases. The value of $V_r$ may be affected by differences in the voltage and current profiles and in the $R_0$ value, whereas the relative shift in $V_r$ has been observed to be similar across batteries. Therefore, unlike 18, the relative change in $V_r$ over cycles has been selected as the feature. Let the relative shift in the $V_r$ vs SOC curve with respect to that at SOH = 1 (cycle 0 or fresh battery) be denoted as $\Delta V_r$:

$$\Delta V_r(\mathrm{SOC}_i) = V_r^{c}(\mathrm{SOC}_i) - V_r^{0}(\mathrm{SOC}_i) \tag{4}$$

where 20% < $\mathrm{SOC}_i$ < 90%. $V_r^{c}$ and $V_r^{0}$ are $V_r$ of cycle $c$ and cycle 0 respectively. $\Delta V_r$ is computed by sampling $V_r^{c}$ and $V_r^{0}$ at the same SOC values. The advantage of using $\Delta V_r$ as the feature is that any component of $V_r$ which remains constant with aging is canceled in the $\Delta V_r$ computation, while any voltage component which varies with aging is reflected in the feature. Let $\Delta\mathrm{SOH}$ be the drop in SOH (from 1) corresponding to a shift of $\Delta V_r$ in $V_r$. $\Delta V_r$ at 30%, 40%, and 50% SOC has been plotted against $\Delta\mathrm{SOH}$ in (c), (g), and (k) respectively for the charge profiles in (a), (e), and (i) of Fig. 2. $\Delta V_r$ can be seen to vary almost linearly with $\Delta\mathrm{SOH}$ at fixed SOC points. $\Delta V_r$ vs $\Delta\mathrm{SOH}$ for all three charge profiles has been shown in a single plot at two different SOC points in Fig. 3. It can be observed that the feature-label proportionality does not change with variation in charge profile.
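The feature computation described above can be illustrated with a minimal NumPy sketch. It assumes a constant sampling interval, a monotonically increasing SOC during charge, and linear interpolation onto a common SOC grid; the function names, the toy OCV curve, and the resistance values are hypothetical and chosen only for illustration. For simplicity the synthetic aged cell keeps the same capacity as the fresh one.

```python
import numpy as np


def soc_percent(current_a, dt_s, c_rated_ah):
    """Coulomb-counted SOC (%) at every sample since the start of charge."""
    charge_ah = np.cumsum(np.asarray(current_a, dtype=float)) * dt_s / 3600.0
    return 100.0 * charge_ah / c_rated_ah


def v_r(voltage_v, current_a, r0_ohm):
    """Measured voltage minus the drop across the fresh-cell resistance R0."""
    return np.asarray(voltage_v, dtype=float) - np.asarray(current_a, dtype=float) * r0_ohm


def delta_v_r(soc_c, vr_c, soc_0, vr_0, soc_grid):
    """Shift of the V_r curve of cycle c relative to cycle 0 on a common SOC grid."""
    # np.interp needs increasing SOC, which holds while charging (I > 0).
    return np.interp(soc_grid, soc_c, vr_c) - np.interp(soc_grid, soc_0, vr_0)


# Synthetic 1C charge of a hypothetical 3 Ah cell, sampled every second.
current = np.full(3600, 3.0)                  # A
soc = soc_percent(current, dt_s=1.0, c_rated_ah=3.0)
ocv = 3.5 + 0.007 * soc                       # toy OCV curve, V
r0, r_aged = 0.060, 0.090                     # ohm (illustrative values)
vr_fresh = v_r(ocv + current * r0, current, r0)
vr_aged = v_r(ocv + current * r_aged, current, r0)

soc_grid = np.arange(30.0, 50.0, 2.0)         # ten points, 2% SOC apart
dvr = delta_v_r(soc, vr_aged, soc, vr_fresh, soc_grid)
print(dvr)  # each entry is close to 3 A * 0.030 ohm = 0.09 V
```

In this toy case the OCV term is identical in both cycles and cancels in the subtraction, so only the aging-dependent resistive drop remains in the feature.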

SOH estimation of exclusive test battery set. A base ANN model was trained using data from 8 cells, which make up about 1/4 of the total number of batteries involved in the experiments. The input to the ANN is the $\Delta V_r$ features computed from the charging voltage and current. SOH is the output of the ANN. The same base ANN has been used in all tests described hereafter, without any further training. Batteries which were not involved in training of the base ANN were used for blind testing of the proposed algorithm. Blind testing was performed to confirm the robustness of the trained base ANN to cell variability. The method was tested on long cycled battery data to validate its performance in advanced cycles. For quantitative evaluation, the estimated SOH values were compared against the ground truth SOH computed from low current (0.2C) probe cycles.
Robustness to charge profile variability. The base ANN was tested on exclusive test battery data. Different charging protocols had been used in cycling these batteries. The test accuracy for the in-house laboratory generated data is given in Table 1 in the form of mean absolute error (MAE), root mean square error (RMSE), and standard deviation error (SDE). The MAE is well within 0.02, and it is below 0.01 for most of the cases. The error is not affected by variation in charging C-rates for S1 batteries. Also, the MAE values for both CCCV and MSCCCV charge profiles are within 0.02. The estimated SOH and error of two S1 cells charged using 0.8C CCCV (B3) and 1.2C CCCV (B6) profiles have been plotted in Fig. 4a,b respectively. Figure 4c shows the estimated SOH and error plot for S6-B1, which had been charged using 1C MSCCCV profiles. It can be observed that the estimated SOH is close to the ground truth even in the advanced cycles.
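The three accuracy metrics can be computed as in the following sketch. MAE and RMSE follow their standard definitions; SDE is taken here to be the standard deviation of the signed estimation error, which is an assumption since the text does not define it explicitly. The function name and sample values are illustrative only.

```python
import numpy as np


def soh_error_metrics(soh_true, soh_est):
    """Return (MAE, RMSE, SDE) of an SOH estimate against the ground truth."""
    err = np.asarray(soh_est, dtype=float) - np.asarray(soh_true, dtype=float)
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    sde = float(np.std(err))  # assumption: population standard deviation of the error
    return mae, rmse, sde


# Hypothetical ground truth and estimates for three cycles.
mae, rmse, sde = soh_error_metrics([1.00, 0.90, 0.80], [1.01, 0.89, 0.80])
```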
Validation in the presence of noise. On-device measured voltage and current have inherent noise due to the associated measurement uncertainty. The effect of noise on the performance of the proposed method has been validated by adding random noise to the cycler data. The measurement uncertainty of the devices used in the experiment is in the range of µV and µA. The added noise amplitude was increased beyond the µV and µA range to measure the noise tolerance limit of the method (assuming an acceptable SOH estimation MAE of 2%). The amplitude of the noise added to the cycler voltage and current was increased in steps of 10 mV and 10 mA, starting from 1 mV and 1 mA respectively. The SOH estimation method was applied to the noisy data. The results indicated that the MAE hit the tolerance limit of 2% at a noise level of around 100 mV and 100 mA. The MAE vs noise amplitude plot, the estimated SOH, and the noisy voltage and current plots have been shown in Fig. 5. It is evident from the plots that the MAE increases with the noise level. If an MAE tolerance limit is set at 2%, then the method can be said to be robust to noise of 100 mV and 100 mA in voltage and current measurements respectively.

Robustness to on-device noise. The same base ANN was used to estimate the SOH of batteries mounted on devices. An Android-based application was made which ran the proposed algorithm in the background while the batteries were charged and discharged using device-specific chargers. Assuming that identical batteries age at the same rate, the SOH computed from laboratory cycled data of similar batteries was used as the ground truth for the on-device experiments. The accuracy numbers have been shown in Table 2. The MAE values are in the same range as the laboratory test results. The estimated SOH and error plots of two batteries from the S1 and S2 sets have been shown in Fig. 6a,b respectively.
Transfer learning for unseen dataset. As the base ANN was trained using data from multiple batteries, it is expected to work for batteries with a similar $\Delta V_r$ vs $\Delta\mathrm{SOH}$ relation (Fig. 3). However, if the battery type or physical dimension of another set of batteries is significantly different, then the feature-label proportionality relation might change 24,25. In such a scenario, transfer learning was used. The first fully connected layer in the base ANN extracts information from the input features; the second layer maps the proportionality relation between input and output. Therefore, the first layer of the base ANN (trained using in-house data) was kept unchanged, and only the second layer was fine-tuned on single-cell data of the new set. Hyperparameters such as the loss function, optimizer, etc. were the same as those used for training the base ANN.
Generalization to different battery datasets. The proposed method was tested on two publicly available datasets: CALCE 26,27 and NASA 28. The batteries in these datasets differ in physical dimension and cell chemistry from the in-house batteries used for training the base ANN. Therefore, the feature-SOH relation will be different, and the base ANN will not work for these datasets as is. Transfer learning was used to adapt the base ANN to the new data. The last fully connected layer of the base ANN was fine-tuned using the data of one randomly chosen cell from each dataset. The rest of the batteries in the set were used for testing. CS2 batteries of the CALCE data were used in the experiment. Fine-tuning of the ANN was done using CS2-33 data. CS2-34, 35, 36, 37, and 38 were used for blind testing. The accuracy numbers have been given in Table 3. The estimated SOH and the estimation error have been plotted in Fig. 7 for two batteries in the set. The MAE is within 0.02 for all the cases.

For the NASA dataset, battery data collected at two different temperatures were used in the experiment. Data of B05, B06, B07, and B18 were collected at room temperature. B29, B30, B31, and B32 data were collected at an elevated temperature of 43 °C. The last FC layer of the base ANN was fine-tuned on B05 data. The other cell data were used for blind testing. The accuracy numbers have been given in Table 4. The estimated SOH and the estimation error have been plotted in Fig. 8 for two batteries in the set. The MAE is within 0.02 for most of the cases, with B18 as an exception. It can also be observed that the feature-SOH relation is not affected by the variation in temperature.

The SOH estimation accuracy of the proposed method on the CALCE and NASA data has been compared with that reported for two recently published methods 13 and 29. The comparison has been shown in Table 5. The accuracy figures for the proposed method are comparable to, and in some cases better than, those of the state-of-the-art.
Additionally, this method does not require ANN fine-tuning for every cell. Fine-tuning only the second layer of the base ANN (trained using in-house data) on the data of one cell of the new dataset is sufficient to test on the other batteries in that dataset. The SOH estimation method reported in 13 [...]. It can be seen from Table 5 that the proposed technique has a lower SDE than 13 in most of the cases. The better accuracy can be attributed to the feature, as the neural network is much simpler than the LSTM model.

Discussion
The better the feature, the more robust the SOH estimation. In this work, a novel, efficient, and generic feature, computed solely from the measurable voltage and current, has been proposed. The feature has a proportional relation with battery SOH. Also, the feature-SOH proportionality relation has been observed to be independent of the charge profile, C-rate, and limited variation in temperature. The voltage drop across the series resistance has been observed to increase with the drop in battery SOH. As a result, the curve of voltage across the resistance vs SOC shifts upwards. The relative voltage shift has been seen to be independent of charge profile (CCCV or MSCCCV) and C-rate. A two-layer ANN has been used to map the variation in voltage to SOH. The proportionality relation between feature and label is the same for similar types of batteries, i.e. batteries having similar physical dimension and base chemistry. Hence, the base ANN can be used to estimate the SOH of similar batteries with an MAE within 0.02. However, the proportionality relation might be different for a battery set whose physical dimension and chemistry differ from those of the training set. In that scenario, the final fully connected layer of the ANN has been fine-tuned for SOH estimation. The data of only one battery has been observed to be sufficient for fine-tuning, giving SOH estimates for the other batteries within an MAE limit of 0.02.
The proposed method has been verified using data from 37 in-house batteries. The batteries were cycled in the laboratory and in user devices using different charge profiles and C-rates. Tests on the exclusive battery set resulted in an SOH estimation MAE within 0.02. The method has also been tested on two publicly available battery datasets: CALCE and NASA. The SOH estimation error for almost all the batteries of the two datasets was within 0.02. Tests on NASA battery data cycled at room temperature and at 43 °C confirmed the robustness of the proposed method to temperature variation. Following are the advantages of the proposed method compared to deep learning methods.
1. It involves less computation as the network is shallow. It can be used on edge devices for real-time battery health monitoring.
2. It is a generic method. The model does not require retraining or fine-tuning for every new cell. Transfer learning is done only once when the battery chemistry and dimension change.
3. The method is robust to variation in charge profile, C-rate, and temperature.

Methods
This section describes the SOH estimation technique. The experimental data generation process has been explained in detail, followed by the feature selection criteria. The feature has been selected based on the analysis of experimentally generated data. The ANN used for SOH mapping has been described next. The steps to estimate SOH and the evaluation metrics have been outlined at the end of this section.

Experimental data generation. Multiple cells of different capacity and charge protocols, as listed in Table 6, were used in the experiment. The basic chemistry of all these batteries was the same: an LCO cathode and graphite anode in a pouch cell configuration. Depending on the manufacturer, there are variations in battery raw material, manufacturing process, additives, etc. As a result, battery behavior such as resistance magnitude and SOC-OCV profile changes. To generate data for algorithm development and validation, 23 batteries were cycled in the laboratory. The batteries were charged and discharged repeatedly while allowing them to rest (current = 0) for 10 min between every charge and discharge. Current, voltage, and timestamps were recorded during the cycling process. For real-time validation of the algorithm, 14 batteries were cycled in commercial devices by replicating user charge-discharge scenarios. Battery specifications such as capacity and charge protocol have also been indicated in Table 6. All S1 batteries have been cycled at different current rates (0.6C, 0.7C, 0.8C, 0.9C, 1C, 1.1C, 1.2C, 1.3C).
In order to get the ground truth SOH for each battery cycled in the laboratory, a low current probe cycle was repeated after every 50 cycles. In the probe cycle, batteries were CCCV charged and CC discharged at 0.2C. The capacity of a probe cycle was computed by coulomb counting. [...] where $\mathrm{SOH}_{c+i}$ denotes the SOH for cycle number $c + i$, and $1 < i < 49$.
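One plausible way to assign a ground truth SOH to the cycles that lie between two probe cycles is linear interpolation of the probe-cycle SOH values. The sketch below illustrates this under that assumption; it is not necessarily the scheme used in this work, and the function name and capacity values are hypothetical.

```python
import numpy as np


def soh_between_probes(probe_cycles, probe_capacity_ah, c_rated_ah, cycles):
    """Interpolate SOH = C_n / C_rated linearly between probe cycles (assumed scheme)."""
    probe_soh = np.asarray(probe_capacity_ah, dtype=float) / c_rated_ah
    return np.interp(cycles, probe_cycles, probe_soh)


# Hypothetical probe capacities measured every 50 cycles on a 3 Ah cell.
probe_cycles = [0, 50, 100]
probe_capacity = [3.00, 2.94, 2.88]          # Ah, from coulomb counting at 0.2C
soh_labels = soh_between_probes(probe_cycles, probe_capacity, 3.0, np.arange(0, 101))
```

With such labels, every regular charging cycle can be paired with an SOH value even though capacity is measured only at the probe cycles.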
ANN used for SOH mapping. A simple two-layer ANN has been used to model the relation between $\Delta V_r$ and $\Delta\mathrm{SOH}$. The feature-label relation is largely linear (Fig. 3). We have observed through experimental validation that the ANN outperforms a linear regression model when used on batteries of unseen devices. Additionally, transfer learning is possible with ANNs, where the model generalizes to new data while retaining its earlier learning. The ANN has been shown in Fig. 9. As shown in the figure, the input to the ANN has dimension 1 × 10. A hidden dense layer of 10 × 10 takes the 1 × 10 input and produces a hidden output of dimension 1 × 10. Then, a second dense layer of 10 × 1 takes the hidden layer output as input and computes the SOH value. Ten consecutive samples of $\Delta V_r$ computed at a gap of 2% SOC constitute one feature vector. A set of feature vectors computed in the SOC range 30-50% at different SOH values has been shown in (d), (h), and (l) of Fig. 2.
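A two-layer network with these dimensions can be sketched in PyTorch as follows. The class name, the ReLU activation between the layers, and the stride-1 sliding window used to form feature vectors are assumptions made for illustration; the text specifies only the layer sizes and the 2% SOC sample spacing.

```python
import torch
from torch import nn


class BaseANN(nn.Module):
    """Two dense layers: 10 inputs -> 10 hidden units -> 1 output."""

    def __init__(self, n_inputs: int = 10, n_hidden: int = 10):
        super().__init__()
        self.fc1 = nn.Linear(n_inputs, n_hidden)
        self.act = nn.ReLU()  # assumption: the activation is not stated in the text
        self.fc2 = nn.Linear(n_hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc2(self.act(self.fc1(x))).squeeze(-1)


def sliding_feature_vectors(dvr: torch.Tensor, width: int = 10) -> torch.Tensor:
    """Stack every run of `width` consecutive samples (stride 1 is an assumption)."""
    return dvr.unfold(0, width, 1)


dvr_curve = torch.linspace(0.01, 0.03, steps=20)   # hypothetical shift values on a 2% SOC grid
features = sliding_feature_vectors(dvr_curve)      # shape (11, 10)
model = BaseANN()
with torch.no_grad():
    outputs = model(features)                      # one estimate per feature vector
```

The network has only 121 trainable parameters, which is consistent with the aim of running the estimator on edge devices.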

SOH estimation method.
A flowchart of the steps followed in the proposed SOH estimation method has been depicted in Fig. 10. The method primarily consists of two steps: offline training or fine-tuning of ANN, and online estimation of SOH.
Feature extraction. Following are the steps to compute feature vectors from the measured voltage and current in a particular cycle.
1. SOC is computed using the current and the sampling interval dt as shown in (3). $V_r$ is computed using the voltage and current as given in (2).
2. $V_r^{0}$ and $\mathrm{SOC}^{0}$ of the first cycle of the fresh battery are saved for the computation of $\Delta V_r$ in later cycles.
Training of base ANN. The base ANN model was trained using data from multiple batteries. One cell from each set listed in Table 6 was chosen to train the ANN. Cells were randomly selected from each set. Feature vectors were computed as described in the previous section. A total of 62,512 feature vectors were extracted from the charging data of those 8 cells. SOH labels for the feature vectors were computed from the ground truth SOH values obtained using probe cycles.
The extracted feature and label set was randomly split in an 80:20 ratio for training and validation. A Windows machine with an Intel Core i7 processor was used for training. The code was written in the PyTorch framework. The training was run for 50 epochs with the Adam optimizer and the L1 loss function. A learning rate of 0.01 was used. The model was tested on the validation set after every epoch. The best model, i.e. the one with the minimum validation loss, was saved.
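The training recipe described above (80:20 random split, Adam, L1 loss, learning rate 0.01, 50 epochs, keeping the model with the lowest validation loss) can be sketched as below. The data are synthetic stand-ins, the ReLU activation is an assumption, and full-batch updates are used for brevity because the batch size is not stated; the function name is hypothetical.

```python
import copy
import torch
from torch import nn


def train_with_validation(model, x, y, epochs=50, lr=0.01, seed=0):
    """Train with Adam and L1 loss on an 80:20 split; keep the best-validation weights."""
    perm = torch.randperm(len(x), generator=torch.Generator().manual_seed(seed))
    n_train = int(0.8 * len(x))
    train_idx, val_idx = perm[:n_train], perm[n_train:]

    loss_fn = nn.L1Loss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_val, best_state = float("inf"), copy.deepcopy(model.state_dict())

    for _ in range(epochs):
        model.train()
        optimizer.zero_grad()
        # Full-batch update for brevity; the batch size is not given in the text.
        loss = loss_fn(model(x[train_idx]).squeeze(-1), y[train_idx])
        loss.backward()
        optimizer.step()

        model.eval()
        with torch.no_grad():
            val = loss_fn(model(x[val_idx]).squeeze(-1), y[val_idx]).item()
        if val < best_val:
            best_val, best_state = val, copy.deepcopy(model.state_dict())

    model.load_state_dict(best_state)
    return best_val


torch.manual_seed(0)
x = 0.05 * torch.rand(500, 10)          # synthetic stand-in feature vectors
y = 4.0 * x.mean(dim=1)                 # synthetic, roughly linear labels
model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 1))
best_val = train_with_validation(model, x, y)
```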
Transfer learning for new dataset. The feature-label proportionality relation will change when the battery chemistry or physical dimension changes 24,25. The ANN model has to be fine-tuned to adapt to the new dataset. The first layer of the base ANN is kept unchanged. The second layer is fine-tuned using the data of only one cell of the new set. The same training hyperparameters, such as loss function, learning rate, number of epochs, optimizer, etc., were used for model tuning.
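Freezing the first layer and fine-tuning only the second can be expressed in PyTorch as in the following sketch. The `nn.Sequential` model is a stand-in for the trained base ANN, the ReLU activation is an assumption, and the data are synthetic; the function name is hypothetical.

```python
import torch
from torch import nn


def fine_tune_last_layer(model: nn.Sequential, x, y, epochs=50, lr=0.01):
    """Freeze every layer except the last one and fine-tune that layer only."""
    for param in model.parameters():
        param.requires_grad = False
    last = model[-1]
    for param in last.parameters():
        param.requires_grad = True

    optimizer = torch.optim.Adam(last.parameters(), lr=lr)
    loss_fn = nn.L1Loss()
    model.train()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(x).squeeze(-1), y)
        loss.backward()
        optimizer.step()
    return model


torch.manual_seed(0)
base = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 1))  # stand-in for the trained base ANN
first_before = base[0].weight.detach().clone()
x_new = 0.05 * torch.rand(200, 10)      # synthetic features from one cell of a new set
y_new = 6.0 * x_new.mean(dim=1)         # synthetic labels with a different slope
fine_tune_last_layer(base, x_new, y_new)
assert torch.equal(base[0].weight, first_before)  # first layer is unchanged
```

Passing only the last layer's parameters to the optimizer, in addition to clearing `requires_grad` on the rest, makes it explicit that the feature-extracting layer is left untouched.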
SOH estimation of an unseen battery. Cells that were excluded during training or fine-tuning were used for blind testing of the learned ANN model. Following are the steps to estimate SOH from charging data of a particular cycle.
1. Feature vectors are computed from the charging voltage and current in a cycle c.
2. All feature vectors are passed through the ANN to get SOH values.